The Language Resource Life Cycle: Towards a Generic Model for Creating, Maintaining, Using and Distributing Language Resources
Author
Georg Rehm, DFKI GmbH, Berlin, Germany
Abstract
Language Resources (LRs) are an essential ingredient of current approaches in Linguistics, Computational Linguistics, Language Technology and related fields. LRs are collections of spoken or written language data, typically annotated with linguistic analysis information. Different types of LRs exist, for example, corpora, ontologies, lexicons, collections of spoken language data (audio), or collections that also include video (multimedia, multimodal). Often, LRs are distributed with specific tools, documentation, manuals or research publications. The different phases involved in creating and distributing an LR can be conceptualised as a life cycle. While the idea of handling the LR production and maintenance process in terms of a life cycle was brought up quite some time ago, a best practice model or common approach is still lacking and can be considered a research gap. This article aims to help fill this gap by proposing an initial version of a generic Language Resource Life Cycle that can be used to inform, direct, control and evaluate LR research and development activities (including description, management, production, validation and evaluation workflows).
Similar resources
Training Language Teachers: An educational semiotic model
Abstract The changing culture toward multimodality enforces acquiring visual literacy in every aspect of today’s modern life. One of the fields intermingled with using various modes in different variations is language teaching and learning, especially for and by young learners. Young language learners’ (5-12 years old) lack of world experience forces them to make the most use of non-verbal mode...
A Supervised Method for Constructing Sentiment Lexicon in Persian Language
Due to the increasing growth of digital content on the internet and social media, sentiment analysis is one of the emerging fields. This problem, which deals with information extraction and knowledge discovery from textual data using natural language processing, has attracted the attention of many researchers. Construction of a sentiment lexicon as a valuable language resource is one of the imp...
Attitudes towards English Language Norms in the Expanding Circle: Development and Validation of a new Model and Questionnaire
This paper describes the development and validation of a new model and questionnaire to measure Iranian English as a foreign language learners’ attitudes towards the use of native versus non-native English language norms. Based on a comprehensive review of the related literature and interviews with domain experts, five factors were identified. A draft version of a questionnaire based on those f...
Towards a System for Dynamic Language Resources in LOD
Formalization and representation of the language resources life cycle in a formal language to support the creation, update and application of the language resource instances is made possible via the developments in the area of ontologies and Linked Open Data. In the paper we present some of the basic functionalities of a system to support dynamic language resources.